Add basic emulation of getcwd/chdir #214
Nice! This looks great. Should make a lot more programs runnable.
Thanks for working on this! Here's an initial round of comments. I haven't yet figured out what to think about calling malloc+free on every open/stat/etc.; see below for details.
This commit adds basic emulation of a current working directory to wasi-libc. The `getcwd` and `chdir` symbols are now implemented and available for use. The `getcwd` implementation is pretty simple in that it just copies out of a new global, `__wasilibc_cwd`, which defaults to `"/"`. The `chdir` implementation is much more involved and has more ramifications, however.

A new function, `make_absolute`, was added to the preopens object. Paths stored in the preopen table are now always stored as absolute paths instead of relative paths, and initial relative paths are interpreted as being relative to `/`. Looking up a path to preopen now always turns it into an absolute path, relative to the current working directory, and an appropriate path is then returned.

The signature of `__wasilibc_find_relpath` has changed as well. It now returns two path components, one for the absolute part and one for the relative part. Additionally, the relative part is always dynamically allocated since it may no longer be a substring of the original input path.

This has been tested lightly against the Rust standard library so far, but I'm not a regular C developer so there are likely a few things to improve!
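To make the description above concrete, here is a minimal sketch of the two simpler pieces: copying the working directory out of a global, and turning a relative path into an absolute one. It is not the code from this PR. The names `sketch_cwd`, `sketch_getcwd`, and `sketch_make_absolute` are hypothetical stand-ins, and the sketch assumes the stored working directory is always an absolute path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the `__wasilibc_cwd` global; defaults to "/". */
const char *sketch_cwd = "/";

/* Copy the emulated working directory into buf.
   Returns buf, or NULL with errno set to ERANGE if buf is too small. */
char *sketch_getcwd(char *buf, size_t size) {
    assert(buf != NULL);
    size_t len = strlen(sketch_cwd);
    if (size < len + 1) {
        errno = ERANGE;
        return NULL;
    }
    memcpy(buf, sketch_cwd, len + 1);
    return buf;
}

/* Return a newly allocated absolute form of path. Absolute inputs are
   copied unchanged; relative inputs are joined onto sketch_cwd.
   The caller frees the result. Returns NULL if allocation fails. */
char *sketch_make_absolute(const char *path) {
    assert(path != NULL);
    size_t path_len = strlen(path);
    size_t cwd_len = 0;
    size_t need_slash = 0;
    if (path[0] != '/') {
        cwd_len = strlen(sketch_cwd);
        /* Insert a separator unless the cwd already ends with one. */
        need_slash = (cwd_len == 0 || sketch_cwd[cwd_len - 1] != '/') ? 1 : 0;
    }
    char *out = malloc(cwd_len + need_slash + path_len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, sketch_cwd, cwd_len);
    if (need_slash)
        out[cwd_len] = '/';
    memcpy(out + cwd_len + need_slash, path, path_len + 1);
    return out;
}
```

With the default working directory, `sketch_make_absolute("a/b")` yields `/a/b`, which matches the statement that initial relative paths are interpreted relative to `/`. A real `chdir` would also need to check that the target is a directory reachable through a preopen, which this sketch leaves out.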
Ok I've pushed up a second commit which should amortize the calls to |
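One common way to amortize allocation cost across many path lookups is to keep a single scratch buffer and grow it only when a request does not fit. The sketch below illustrates that general technique and is not the code from the second commit; `sketch_scratch_reserve` and `sketch_copy_path` are hypothetical names, and the sketch assumes single-threaded use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One long-lived scratch buffer shared by all lookups (not thread-safe). */
static char *scratch = NULL;
static size_t scratch_cap = 0;

/* Return a buffer of at least `needed` bytes. The buffer is reallocated
   only when the current capacity is too small, so repeated calls with
   similar sizes do not hit the allocator. Returns NULL on failure. */
char *sketch_scratch_reserve(size_t needed) {
    if (needed > scratch_cap) {
        size_t new_cap = scratch_cap ? scratch_cap : 16;
        while (new_cap < needed) {
            if (new_cap > SIZE_MAX / 2) { /* avoid overflow when doubling */
                new_cap = needed;
                break;
            }
            new_cap *= 2;
        }
        char *grown = realloc(scratch, new_cap);
        if (grown == NULL)
            return NULL;
        scratch = grown;
        scratch_cap = new_cap;
    }
    return scratch;
}

/* Copy path into the scratch buffer. The result is valid only until the
   next call that uses the scratch buffer. */
const char *sketch_copy_path(const char *path) {
    assert(path != NULL);
    size_t len = strlen(path) + 1;
    char *buf = sketch_scratch_reserve(len);
    if (buf == NULL)
        return NULL;
    memcpy(buf, path, len);
    return buf;
}
```

The trade-off is that the returned pointer is invalidated by the next lookup, so callers must finish using it, or copy it, before making another call.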
Reusing the malloc buffer looks good!
This patch does increase code size of simple programs using `open` by about 4K. While that's within noise for some users, it's noticeable for others.
I'm considering whether we should move cwd emulation into a separate library. Similar to how emulated mmap requires compiling with `-D_WASI_EMULATED_MMAN` and linking with `-lwasi-emulated-mman`, we could make `getcwd` and `chdir` require compiling with `-D_WASI_EMULATED_CWD` and linking with `-lwasi-emulated-cwd`. Implementing this is mostly Makefile mechanics, which I can do separately, so we don't need to block this PR on it, but I am interested in whether anyone has opinions about this approach.
This is something I'd like to enable by default with rust-lang/rust, so if binary size is a concern I think it would be better to handle this for every program rather than just those that don't opt in. I'd also imagine that almost all programs would want to opt in given how prevalent the concept of a current directory is. I feel like this is sort of a consequence of the wasi-libc strategy, though: if we want to put stuff in wasm rather than in WASI APIs, that's inevitably going to make modules larger. Is there a way to do weak linking tricks or something like that to only pull in this extra processing when |
I think that might be possible yes. I think you would want to co-locate |
Perhaps if it's very common then it could be opt-out instead, so tiny programs could link with |
Ok so I'm having a really hard time wrangling weak symbols to do what I want. Inevitably they never seem to work for me...
First I refactored a bit where I defined a bunch of weak functions in places and then defined a copy of each function in
Next I refactored everything so the meat was defined in
Basically nothing I did was able to actually work. No matter what I did the linker kept pulling in the strong versions defined in
I looked a bit though and of the 4k increase it looks like 3k is due to
I'm not really sure how much this is all worth it, in the sense that any nontrivial application is almost surely going to have |
This part is confusing me. A weak reference to symbol in Are you declaring the function like this within
? To aid debugging you can try |
Oh.. I see the problem. The confusion is between weak references and weakly defined symbols. They are confusingly quite different and have completely different uses. In this case you want weak references to strongly (normal) defined functions. I think of it like this:
We really should not have re-used the word weak for these two different concepts. |
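The difference between the two concepts can be sketched in C roughly as follows. This is an illustration under assumptions, not the list from the original comment: all `sketch_*` names are hypothetical, and the behavior shown relies on the GCC/Clang `weak` attribute on targets where an unresolved weak reference has a null address (for example ELF or wasm).

```c
#include <stddef.h>

/* Weak *reference*: a declaration only. If no object file linked into the
   program defines this function, the link still succeeds and the address
   of the symbol is null. A weak reference alone also does not force an
   archive member defining the symbol to be pulled in. */
int sketch_optional_hook(int value) __attribute__((weak));

/* Weak *definition*: a real function body. If some other object file
   provides a normal (strong) definition of the same name, the linker
   uses that one and this body is discarded. */
__attribute__((weak)) int sketch_default_limit(void) {
    return 64;
}

/* Call the optional hook only if something actually provided it. */
int sketch_call_hook(int value) {
    if (sketch_optional_hook)
        return sketch_optional_hook(value);
    return -1; /* fallback when the hook was not linked in */
}
```

In this sketch nothing defines `sketch_optional_hook`, so `sketch_call_hook` takes the fallback path. That mirrors the goal discussed in this thread: the common code holds only a weak reference to the expensive routine, and the routine itself is defined normally in the object file that a program pulls in by other means.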
In

    int __wasilibc_find_relpath_alloc(
        const char *path,
        const char **abs,
        char **relative,
        size_t *relative_len,
        int can_realloc
    ) __attribute__((weak));

and it's called like:

    if (__wasilibc_find_relpath_alloc)
        return __wasilibc_find_relpath_alloc(path, abs_prefix, relative_path, &relative_path_len, 0);

and in
|
(I can gist the full patch too if it helps, it's just not 100% finished yet) |
Can you try |
That just prints out
|
Ok so poking around a bit more, if I move @sbc100 is this perhaps a bug in the |
Certainly sounds very odd. Are there any symbols that both of those object files define? Otherwise I can't think of a reason why ordering within an archive could/should matter. I would love to take a look at a repro for this.. do you have repro steps you could include here? |
Ok I've pushed up a commit which I think does the weak symbol business. Locally, the size of the example program increases from 38686 to 39021, an increase of 335 bytes. @sbc100 there's some makefile tweaks here, and if those makefile tweaks are removed then this example program increases 4k in size (pulls in |
wasm-ld fix: https://reviews.llvm.org/D85567 |
Nice! I've updated comments in the |
…e) version is seen first When a weak reference of a lazy symbol occurs we were not correctly updating the lazy symbol. We need to tag the existing lazy symbol as weak and, in the case of a function symbol, give it a signature. Without the signature we can't then create the dummy function which is needed when a weakly undefined function is called. We had tests for weakly referenced lazy symbols but we were only testing the case where the reference was seen before the lazy symbol. See: WebAssembly/wasi-libc#214 Differential Revision: https://reviews.llvm.org/D85567
Looks good, thanks for putting together all the pieces for this! Just a few more comments:
Sorry for the delay, but should be updated now! |
Looks good!